Best Arm Identification: A Unified Approach to Fixed Budget and Fixed Confidence
Authors
Abstract
We study the problem of identifying the best arm(s) in the stochastic multi-armed bandit setting. This problem has been studied in the literature from two different perspectives: fixed budget and fixed confidence. We propose a unifying approach that leads to a meta-algorithm called unified gap-based exploration (UGapE), with a common structure and similar theoretical analysis for these two settings. We prove a performance bound for the two versions of the algorithm showing that the two problems are characterized by the same notion of complexity. We also show how the UGapE algorithm as well as its theoretical analysis can be extended to take into account the variance of the arms and to multiple bandits. Finally, we evaluate the performance of UGapE and compare it with a number of existing fixed budget and fixed confidence algorithms.
Similar resources
Tight (Lower) Bounds for the Fixed Budget Best Arm Identification Bandit Problem
We consider the problem of best arm identification with a fixed budget T, in the K-armed stochastic bandit setting, with arms distribution defined on [0, 1]. We prove that any bandit strategy, for at least one bandit problem characterized by a complexity H, will misidentify the best arm with probability lower bounded by exp(−T / (log(K) H)), where H is the sum for all sub-optimal arms of the inv...
Best-Arm Identification in Linear Bandits
We study the best-arm identification problem in linear bandits, where the rewards of the arms depend linearly on an unknown parameter θ and the objective is to return the arm with the largest reward. We characterize the complexity of the problem and introduce sample allocation strategies that pull arms to identify the best arm with a fixed confidence, while minimizing the sample budget. In parti...
On the Complexity of Best-Arm Identification in Multi-Armed Bandit Models
The stochastic multi-armed bandit model is a simple abstraction that has proven useful in many different contexts in statistics and machine learning. Whereas the achievable limit in terms of regret minimization is now well known, our aim is to contribute to a better understanding of the performance in terms of identifying the m best arms. We introduce generic notions of complexity for the two d...
Bayesian Best-Arm Identification for Selecting Influenza Mitigation Strategies
Pandemic influenza has the epidemic potential to kill millions of people. While various preventive measures exist (i.a., vaccination and school closures), deciding on strategies that lead to their most effective and efficient use remains challenging. To this end, individual-based epidemiological models are essential to assist decision makers in determining the best strategy to curb epidemic s...
Exploiting correlation and budget constraints in Bayesian multi-armed bandit optimization
We address the problem of finding the maximizer of a nonlinear smooth function, that can only be evaluated point-wise, subject to constraints on the number of permitted function evaluations. This problem is also known as fixed-budget best arm identification in the multi-armed bandit literature. We introduce a Bayesian approach for this problem and show that it empirically outperforms both the e...